

# Face detection accelerator ASIC

School of Electronic and Electrical Engineering, Kyungpook National University

Jongkil Hyun, Kyeong-Kuk Min, and Byungin Moon

#### Abstract

Embedded identity verification systems that can help address the growing problem of missing children are advancing. Developing these systems requires face detection that extracts only the face region. Therefore, in this study, a face detection accelerator was designed based on the adaptive boosting (AdaBoost) training algorithm to detect only the face region in an input image acquired by a camera [1][2]. The designed face detection accelerator detects faces of various sizes in the input image by adopting the image pyramid method. In addition, it occupies a small area and consumes little power. In particular, implementing the face detection accelerator as an application-specific integrated circuit (ASIC) increases its utility in edge devices that require low-power operation. For this reason, an AdaBoost-based face detection accelerator was developed as an ASIC using the Samsung 28-nm fabrication process.
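The image pyramid method mentioned above can be illustrated with a minimal sketch. A fixed-size detection window is applied to successively downscaled copies of the frame, so that larger faces fit the window at coarser levels. The window size of 60 (chosen to match the minimum bounding box reported in the Conclusion) and the scale step of 1.25 are assumptions for illustration only; the paper does not state the parameters used by the scaler.

```python
def pyramid_levels(width, height, window=60, scale=1.25):
    """Return (level, scaled_w, scaled_h, face_size) tuples for an image pyramid.

    window and scale are hypothetical parameters, not values from the paper.
    face_size is the approximate side length, in original-image pixels,
    covered by the fixed window at that level.
    """
    levels = []
    factor = 1.0
    level = 0
    while True:
        w = int(width / factor)
        h = int(height / factor)
        # Stop once the scaled image can no longer contain one window.
        if w < window or h < window:
            break
        levels.append((level, w, h, round(window * factor)))
        factor *= scale
        level += 1
    return levels


levels = pyramid_levels(1920, 1080)
for level, w, h, face in levels:
    print(f"level {level}: {w}x{h}, window covers about {face} px in the original")
```

A smaller scale step yields more levels and finer size coverage at the cost of more windows to classify per frame.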

#### Hardware Architecture

As shown in Figure 1, the hardware architecture consists of eight modules: 1) input/output interface, 2) frame buffer, 3) scaler, 4) address generator, 5) line buffer, 6) integral image generator, 7) cascade classifier, and 8) merger.

In the integral image generator module, the word length reduction method [1] was applied, which reduces memory usage by 50% compared with the conventional integral image generation method. To improve the processing speed, the cascade classifier module performs classification by repeatedly using nine parallel weak classifiers. In addition, the cascade classifier module adopts the skip scheme to further improve the processing speed [2].
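To make the role of the integral image concrete, the following sketch computes a conventional integral image and a rectangle sum from four lookups. It also shows one generic way a shorter word can suffice: if stored values wrap modulo 2^bits and every window sum is below 2^bits, the four-corner difference is still exact. This modular variant is an illustrative assumption and is not necessarily the word length reduction method of [1]; the function names and the 20 × 20 window are likewise hypothetical.

```python
def integral_image(img, bits=None):
    """ii[y][x] = sum of img[0..y][0..x]; values wrap to `bits` bits if given."""
    mask = (1 << bits) - 1 if bits else None
    h, w = len(img), len(img[0])
    ii = [[0] * w for _ in range(h)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            val = row_sum + (ii[y - 1][x] if y > 0 else 0)
            ii[y][x] = val & mask if mask is not None else val
    return ii


def rect_sum(ii, x0, y0, x1, y1, bits=None):
    """Sum of pixels in the inclusive rectangle (x0, y0)-(x1, y1)."""
    a = ii[y1][x1]
    b = ii[y0 - 1][x1] if y0 > 0 else 0
    c = ii[y1][x0 - 1] if x0 > 0 else 0
    d = ii[y0 - 1][x0 - 1] if x0 > 0 and y0 > 0 else 0
    total = a - b - c + d
    # Masking the difference undoes any wraparound in the stored values.
    return total & ((1 << bits) - 1) if bits else total


# Worst case: an all-white 32x32 image with a hypothetical 20x20 window.
img = [[255] * 32 for _ in range(32)]
BITS = 17  # 20 * 20 * 255 = 102000 < 2**17, so a window sum fits in 17 bits

full = integral_image(img)
narrow = integral_image(img, bits=BITS)

exact = rect_sum(full, 12, 12, 31, 31)
wrapped = rect_sum(narrow, 12, 12, 31, 31, bits=BITS)
print(exact, wrapped, full[31][31] > (1 << BITS))
```

In this example the full-width integral image exceeds 17 bits at its bottom-right corner, yet the 17-bit version returns the same window sum.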

#### ASIC Design

To develop the application-specific integrated circuit (ASIC), we designed the hardware architecture in Verilog HDL and verified its operation in a real environment on an FPGA. Table 1 shows the design specifications of the face detection accelerator. The gate count is the result of synthesis with Design Compiler, and the power is the value reported by ICC2 after layout. The ASIC development followed the process shown in Table 2, and the final chip layout is shown in Figure 2.



Figure 1. Hardware architecture of the face detection accelerator

| Specification    | Value                |
|------------------|----------------------|
| Image resolution | 1920 × 1080          |
| Frequency        | 148.5 MHz            |
| Memory           | 507 SRAMs of 4096×16 |
|                  | 21 SRAMs of 640×8    |
|                  | 9 ROMs of 256×60     |
|                  | 9 ROMs of 256×18     |
|                  | 9 ROMs of 256×24     |
|                  | 1 ROM of 256×27      |
| Gate count       | 1,100,201            |
| Power            | 43.2 mW              |
| Die size         | 4 mm × 4 mm          |
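The memory entries in the specification table can be totaled with a short calculation. The sketch below assumes each entry reads as count × depth × word width in bits, which is the usual convention but is not stated explicitly in the paper.

```python
# (kind, count, depth in words, word width in bits), copied from the table.
memories = [
    ("SRAM", 507, 4096, 16),
    ("SRAM", 21, 640, 8),
    ("ROM", 9, 256, 60),
    ("ROM", 9, 256, 18),
    ("ROM", 9, 256, 24),
    ("ROM", 1, 256, 27),
]


def total_bits(kind):
    """Total capacity in bits of all memories of the given kind."""
    return sum(n * depth * width for k, n, depth, width in memories if k == kind)


sram_bits = total_bits("SRAM")
rom_bits = total_bits("ROM")
large_sram_words = 507 * 4096
fhd_pixels = 1920 * 1080

print(f"SRAM: {sram_bits} bits ({sram_bits / 8 / 2**20:.2f} MiB)")
print(f"ROM:  {rom_bits} bits ({rom_bits / 8 / 2**10:.2f} KiB)")
print(f"large SRAM words: {large_sram_words}, FHD pixels: {fhd_pixels}")
```

Under this reading, the large SRAM group holds slightly more words than an FHD frame has pixels. Whether this group implements the frame buffer is not stated in the paper.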

#### Test Environment for Chip Operation Verification

Figure 3 shows the test environment for verifying the chip operation. The ASIC chip is mounted on the chip test board, which is connected to a field-programmable gate array (FPGA) board. The FPGA board transmits the test image to the ASIC test board and receives the computed image from the ASIC. The FPGA board then transmits the computed image to a personal computer (PC) through the interface board. Finally, the operation of the ASIC chip is verified by checking the computed image on the PC.

#### Conclusion

In this study, we developed a face detection accelerator ASIC that adopts word length reduction, nine parallel weak classifiers, and the skip scheme. Verification on an FPGA confirmed that the proposed face detection accelerator operates at 10 frames per second (fps) at FHD resolution. In addition, it can detect small faces because its minimum bounding box size is 60 × 60.
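The reported frame rate can be related to a per-frame cycle budget with simple arithmetic. The sketch below assumes that the 10 fps figure applies at the 148.5 MHz clock listed in Table 1; the paper reports the frame rate from FPGA verification and does not state the clock used there, so the result is only indicative.

```python
CLOCK_HZ = 148_500_000      # frequency from Table 1
FPS = 10                    # frame rate reported in the Conclusion
WIDTH, HEIGHT = 1920, 1080  # FHD resolution

cycles_per_frame = CLOCK_HZ // FPS
pixels_per_frame = WIDTH * HEIGHT
cycles_per_pixel = cycles_per_frame / pixels_per_frame

print(f"cycles per frame: {cycles_per_frame}")
print(f"pixels per frame: {pixels_per_frame}")
print(f"average cycles per input pixel: {cycles_per_pixel:.2f}")
```

Under this assumption, the whole pipeline, including all pyramid levels and cascade stages, has roughly seven clock cycles on average per input pixel.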

#### References

[1] J. Kim, J. Hyun, and B. Moon, "Low-cost Hardware Architecture for Integral Image Generation using Word Length Reduction," in Proc. Int. SoC Design Conf. (ISOCC), pp. 119-120, 2020.
[2] J. Hyun, J. Kim, C.-H. Choi, and B. Moon, "Hardware Architecture of a Haar Classifier Based Face Detection System Using a Skip Scheme," in Proc. Int. Symp. Circuits and Systems (ISCAS), pp. 1-4, 2021.

Table 1. ASIC design specification

| Phase     | Task | Description                        |
|-----------|------|------------------------------------|
| Front-end | 1    | RTL Design & Function Simulation   |
|           | 2    | Synthesis                          |
|           | 3    | Design Rule Check                  |
|           | 4    | Formal Verification                |
|           | 5    | Pre-layout Static Timing Analysis  |
|           | 6    | Pre-layout Simulation with SDF     |
| Back-end  | 7    | Place & Route                      |
|           | 8    | RC Extraction                      |
|           | 9    | Post-layout Static Timing Analysis |
|           | 10   | Post-layout Simulation with SDF    |
|           | 11   | Static Power Analysis              |
|           | 12   | Physical Verification              |

Table 2. ASIC development process and task



Figure 2. Chip layout



#### Acknowledgement

The chip fabrication and EDA tools were supported by the IC Design Education Center (IDEC), Korea.

This research was supported by the Multi-Ministry Collaborative R&D program (R&D program for complex cognitive technology) through the National Research Foundation of Korea (NRF) funded by the Ministry of Trade, Industry and Energy (NRF-2018M3E3A1057248).





